Abstract

In this paper, we obtain three different expressions for all higher
order derivatives of the permanent of a matrix and of the coefficients of its
characteristic polynomial. An upper bound for the norms of the derivatives
of the permanent is given. The norms of the derivatives of the coefficients
of the characteristic polynomial are evaluated exactly.

1 Introduction 

The Jacobi formula for the derivative of the determinant of a matrix is well
known. Recently, Bhatia and Jain have obtained analogous expressions for
its higher order derivatives (see [7]). They derive three formulas, each of
which is a generalisation of the Jacobi formula.

In this note, we extend these results in two directions. First, we obtain
formulas for derivatives of all orders for the permanent function. Second,
we obtain similar formulas for all the coefficients of the characteristic
polynomial of a matrix. These formulas then lead to higher order perturbation
bounds for the functions studied.
Let $A = (a_{ij})$ be an $n \times n$ complex matrix, and let $M(n)$ denote
the space of all $n \times n$ complex matrices. Let
$\phi : M(n) \to \mathbb{C}$ be a differentiable map. For each $X \in M(n)$,

$$D\phi(A)(X) = \left.\frac{d}{dt}\right|_{t=0} \phi(A + tX). \tag{1.1}$$

Then $D\phi$ is a map of $M(n)$ into $\mathcal{L}(M(n);\mathbb{C})$, the space
of all linear operators from $M(n)$ into $\mathbb{C}$. The second derivative of
$\phi$ at $A$ is the derivative of $D\phi$ at $A$ and is denoted by
$D^2\phi(A)$. This is an element of
$\mathcal{L}(M(n);\mathcal{L}(M(n);\mathbb{C}))$, which is identified with
$\mathcal{L}_2(M(n);\mathbb{C})$, the space of bilinear mappings of
$M(n) \times M(n)$ into $\mathbb{C}$. Similarly, for any $k$, the $k$th
derivative of $\phi$ at $A$, denoted by $D^k\phi(A)$, is an element of
$\mathcal{L}_k(M(n);\mathbb{C})$, the space of multilinear mappings of
$M(n) \times \cdots \times M(n)$ into $\mathbb{C}$ (see [?]). For
$X^1, \ldots, X^k \in M(n)$,

$$D^k\phi(A)(X^1, \ldots, X^k) = \left.\frac{\partial^k}{\partial t_1 \cdots \partial t_k}\right|_{t_1 = \cdots = t_k = 0} \phi(A + t_1 X^1 + \cdots + t_k X^k). \tag{1.2}$$



Notations. Let

$$Q_{k,n} = \{(i_1, \ldots, i_k) \mid 1 \le i_1 < \cdots < i_k \le n\}$$

and

$$G_{k,n} = \{(i_1, \ldots, i_k) \mid 1 \le i_1 \le \cdots \le i_k \le n\}.$$

For $k > n$, $Q_{k,n} = \emptyset$, by convention. Also, note that for
$k \le n$, $Q_{k,n}$ is a subset of $G_{k,n}$. Given two elements
$\mathcal I$ and $\mathcal J$ of $G_{k,n}$, let $A[\mathcal I|\mathcal J]$
denote the $k \times k$ matrix whose $(r,s)$-entry is the $(i_r, j_s)$-entry
of $A$. If $\mathcal I, \mathcal J \in Q_{k,n}$, then
$A[\mathcal I|\mathcal J]$ is a submatrix of $A$. The rest of the notation is
the same as in [7]:

If $\mathcal I, \mathcal J \in Q_{k,n}$, then we denote by
$A(\mathcal I|\mathcal J)$ the $(n-k) \times (n-k)$ submatrix obtained from
$A$ by deleting rows $\mathcal I$ and columns $\mathcal J$. The $j$th column
of the matrix $X$ is denoted by $X_{[j]}$. Given $n \times n$ matrices
$X^1, \ldots, X^k$ and $\mathcal J = (j_1, \ldots, j_k) \in Q_{k,n}$, we use
$A(\mathcal J; X^1, \ldots, X^k)$ to mean the matrix obtained from $A$ by
replacing the $j_p$th column of $A$ by the $j_p$th column of $X^p$ for
$1 \le p \le k$, and keeping the rest of the columns unchanged; that is, if
$Z = A(\mathcal J; X^1, \ldots, X^k)$, then $Z_{[j_p]} = X^p_{[j_p]}$ for
$1 \le p \le k$, and $Z_{[l]} = A_{[l]}$ if $l$ does not occur in
$\mathcal J$. Let $\sigma$ be a permutation on $k$ symbols. By
$Y^{\sigma}_{[\mathcal J]}$ we mean the matrix in which
$Y^{\sigma}_{[j_p]} = X^{\sigma(p)}_{[j_p]}$ for $1 \le p \le k$ and
$Y^{\sigma}_{[l]} = 0$ if $l$ does not occur in $\mathcal J$.
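The index sets and matrix operations above can be made concrete with a short sketch. The function names (`Q`, `G`, `pick`, `delete`, `replace_columns`) are illustrative choices and do not come from the paper; indices are 1-based to match the text.

```python
from itertools import combinations, combinations_with_replacement
from math import comb

def Q(k, n):
    """Strictly increasing k-tuples from {1,...,n} (the set Q_{k,n})."""
    return list(combinations(range(1, n + 1), k))

def G(k, n):
    """Non-decreasing k-tuples from {1,...,n} (the set G_{k,n})."""
    return list(combinations_with_replacement(range(1, n + 1), k))

def pick(A, I, J):
    """A[I|J]: the matrix whose (r,s)-entry is the (i_r, j_s)-entry of A."""
    return [[A[i - 1][j - 1] for j in J] for i in I]

def delete(A, I, J):
    """A(I|J): delete the rows listed in I and the columns listed in J."""
    n = len(A)
    rows = [i for i in range(1, n + 1) if i not in I]
    cols = [j for j in range(1, n + 1) if j not in J]
    return pick(A, rows, cols)

def replace_columns(A, J, Xs):
    """A(J; X^1,...,X^k): column j_p of A is replaced by column j_p of X^p."""
    Z = [row[:] for row in A]
    for j, X in zip(J, Xs):
        for i in range(len(A)):
            Z[i][j - 1] = X[i][j - 1]
    return Z

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
X1 = [[10, 11, 12], [13, 14, 15], [16, 17, 18]]
X2 = [[20, 21, 22], [23, 24, 25], [26, 27, 28]]

print(len(Q(2, 3)), len(G(2, 3)))            # sizes of Q_{2,3} and G_{2,3}
print(pick(A, (1, 3), (2, 3)))               # rows 1, 3 and columns 2, 3
print(delete(A, (1, 3), (2, 3)))             # what remains: row 2, column 1
print(replace_columns(A, (1, 3), [X1, X2]))  # columns 1 and 3 replaced
```

Since `pick` accepts repeated indices, it also covers $A[\mathcal I|\mathcal J]$ for $\mathcal I, \mathcal J \in G_{k,n}$, where the result is no longer a submatrix.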

2 Permanent 

The permanent of $A$, written as $\operatorname{per}(A)$, or simply
$\operatorname{per} A$, is defined by

$$\operatorname{per} A = \sum_{\sigma} a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}, \tag{2.1}$$

where the summation extends over all the permutations $\sigma$ of
$1, 2, \ldots, n$.

Let $\operatorname{per} : M(n) \to \mathbb{C}$ be the map taking an
$n \times n$ matrix to its permanent. This is a differentiable map. We denote
by $D \operatorname{per} A$ the derivative of per at $A$.

The famous Jacobi formula for the determinant says that

$$D \det A(X) = \operatorname{tr}(\operatorname{adj}(A) X), \tag{2.2}$$

where the symbol $\operatorname{adj}(A)$ stands for the adjugate (the classical
adjoint) of $A$. The permanental adjoint, denoted by
$\operatorname{padj}(A)$, is the $n \times n$ matrix whose $(i,j)$-entry is
$\operatorname{per} A(i|j)$ (see [10], page 237). We obtain the following
result, similar to the Jacobi formula for the determinant.

Theorem 2.1. For each $X \in M(n)$,

$$D \operatorname{per}(A)(X) = \operatorname{tr}(\operatorname{padj}(A)^T X). \tag{2.3}$$

Proof. For $1 \le j \le n$, let $A(j; X)$ be the matrix obtained from $A$ by
replacing the $j$th column of $A$ by the $j$th column of $X$ and keeping the
rest of the columns unchanged. Then (2.3) can be restated as

$$D \operatorname{per}(A)(X) = \sum_{j=1}^{n} \operatorname{per} A(j; X). \tag{2.4}$$

From (1.1), we note that $D \operatorname{per} A(X)$ is the coefficient of $t$
in the polynomial $\operatorname{per}(A + tX)$. Using the fact that per is a
linear function of each of its columns, we immediately obtain (2.4). □

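The Laplace expansion theorem for permanents ([11], page 16) says that
for any $\mathcal I \in Q_{k,n}$,

$$\operatorname{per} A = \sum_{\mathcal J \in Q_{k,n}} \operatorname{per} A[\mathcal I|\mathcal J] \, \operatorname{per} A(\mathcal I|\mathcal J). \tag{2.5}$$

In particular, for any $i$, $1 \le i \le n$,

$$\operatorname{per} A = \sum_{j=1}^{n} a_{ij} \operatorname{per}(A(i|j)). \tag{2.6}$$

Using this, equation (2.4) can be rewritten as

$$D \operatorname{per}(A)(X) = \sum_{i=1}^{n} \sum_{j=1}^{n} x_{ij} \operatorname{per} A(i|j). \tag{2.7}$$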
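The three equivalent forms (2.3), (2.4) and (2.7) can be checked numerically on a small integer matrix. The following is only a sketch: the helper names are hypothetical, the permanent is computed by brute force from (2.1), and the coefficient of $t$ in $\operatorname{per}(A + tX)$ is extracted with first-order (dual number) arithmetic, a technique chosen here for the check and not taken from the paper.

```python
from itertools import permutations

def per(M):
    """Permanent by direct summation over permutations, as in (2.1)."""
    n = len(M)
    total = 0
    for s in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= M[i][s[i]]
        total += prod
    return total

def dper_dual(A, X):
    """Coefficient of t in per(A + tX), computed modulo t^2."""
    n = len(A)
    der = 0
    for s in permutations(range(n)):
        p0, p1 = 1, 0  # running product is p0 + p1*t (mod t^2)
        for i in range(n):
            a, x = A[i][s[i]], X[i][s[i]]
            p0, p1 = p0 * a, p0 * x + p1 * a
        der += p1
    return der

def minor(A, i, j):
    """A(i|j): delete row i and column j (0-based indices in this sketch)."""
    n = len(A)
    return [[A[r][c] for c in range(n) if c != j] for r in range(n) if r != i]

def replace_column(A, j, X):
    """A(j; X): column j of A replaced by column j of X."""
    n = len(A)
    return [[X[r][c] if c == j else A[r][c] for c in range(n)] for r in range(n)]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
X = [[1, 0, 2], [0, 1, 1], [3, 1, 0]]
n = len(A)

lhs = dper_dual(A, X)
rhs_24 = sum(per(replace_column(A, j, X)) for j in range(n))
rhs_27 = sum(X[i][j] * per(minor(A, i, j)) for i in range(n) for j in range(n))
padj = [[per(minor(A, i, j)) for j in range(n)] for i in range(n)]
# tr(padj(A)^T X) equals the entrywise sum of padj(A)_{ij} * x_{ij}
rhs_23 = sum(padj[i][j] * X[i][j] for i in range(n) for j in range(n))
print(lhs, rhs_24, rhs_27, rhs_23)
```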

The following two theorems are analogues of Theorems 1 and 2 of [7]
and also generalisations of (2.4) and (2.7), respectively. Their proofs
imitate the proofs in [7].



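Theorem 2.2. For $1 \le k \le n$,

$$D^k \operatorname{per} A(X^1, \ldots, X^k) = \sum_{\sigma \in S_k} \sum_{\mathcal J \in Q_{k,n}} \operatorname{per} A(\mathcal J; X^{\sigma(1)}, X^{\sigma(2)}, \ldots, X^{\sigma(k)}). \tag{2.8}$$

In particular,

$$D^k \operatorname{per} A(X, \ldots, X) = k! \sum_{\mathcal J \in Q_{k,n}} \operatorname{per} A(\mathcal J; X, \ldots, X).$$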



Proof. From (1.2), it follows that
$D^k \operatorname{per} A(X^1, \ldots, X^k)$ is the coefficient of
$t_1 \cdots t_k$ in the expansion of
$\operatorname{per}(A + t_1 X^1 + \cdots + t_k X^k)$. Using the linearity
of the per function in each of its columns, we obtain (2.8). □
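Formula (2.8) with distinct directions can be tested against the definition (1.2) on small integer matrices. The sketch below extracts the coefficient of $t_1 \cdots t_k$ by multiplying out $\operatorname{per}(A + t_1 X^1 + \cdots + t_k X^k)$ modulo $t_p^2$, storing a polynomial as a list indexed by bitmasks; this bookkeeping and all function names are assumptions made for the illustration.

```python
from itertools import combinations, permutations

def per(M):
    """Permanent by direct summation over permutations."""
    n = len(M)
    total = 0
    for s in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= M[i][s[i]]
        total += prod
    return total

def mixed_coefficient(A, Xs):
    """Coefficient of t_1...t_k in per(A + t_1 X^1 + ... + t_k X^k).

    Bit p of a mask is set when the monomial contains t_p; higher powers of
    any t_p are discarded because they cannot contribute to t_1...t_k.
    """
    n, k = len(A), len(Xs)
    full = (1 << k) - 1
    total = 0
    for s in permutations(range(n)):
        poly = [0] * (1 << k)
        poly[0] = 1
        for i in range(n):
            j = s[i]
            new = [0] * (1 << k)
            for mask in range(1 << k):
                new[mask] = poly[mask] * A[i][j]
                for p in range(k):
                    if mask & (1 << p):
                        new[mask] += poly[mask ^ (1 << p)] * Xs[p][i][j]
            poly = new
        total += poly[full]
    return total

def replace_columns(A, J, Xs):
    """A(J; X^1,...,X^k) with 0-based column indices."""
    Z = [row[:] for row in A]
    for j, X in zip(J, Xs):
        for i in range(len(A)):
            Z[i][j] = X[i][j]
    return Z

def rhs_28(A, Xs):
    """Right-hand side of (2.8): sum over sigma in S_k and J in Q_{k,n}."""
    n, k = len(A), len(Xs)
    return sum(per(replace_columns(A, J, [Xs[p] for p in sigma]))
               for sigma in permutations(range(k))
               for J in combinations(range(n), k))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
X1 = [[1, 0, 2], [0, 1, 1], [3, 1, 0]]
X2 = [[2, 1, 0], [1, 0, 3], [0, 2, 1]]
X3 = [[0, 1, 1], [2, 0, 1], [1, 1, 2]]

for Xs in ([X1], [X1, X2], [X1, X2, X3]):
    print(len(Xs), mixed_coefficient(A, Xs), rhs_28(A, Xs))
```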

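Theorem 2.3. For $1 \le k \le n$,

$$D^k \operatorname{per} A(X^1, \ldots, X^k) = \sum_{\sigma \in S_k} \sum_{\mathcal I, \mathcal J \in Q_{k,n}} \operatorname{per} A(\mathcal I|\mathcal J) \, \operatorname{per} Y^{\sigma}_{[\mathcal J]}[\mathcal I|\mathcal J]. \tag{2.9}$$

In particular,

$$D^k \operatorname{per} A(X, \ldots, X) = k! \sum_{\mathcal I, \mathcal J \in Q_{k,n}} \operatorname{per} A(\mathcal I|\mathcal J) \, \operatorname{per} X[\mathcal I|\mathcal J].$$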

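Proof. For each $\mathcal J \in Q_{k,n}$, the Laplace expansion theorem gives

$$\operatorname{per} A(\mathcal J; X^{\sigma(1)}, \ldots, X^{\sigma(k)}) = \sum_{\mathcal I \in Q_{k,n}} \operatorname{per} A(\mathcal I|\mathcal J) \, \operatorname{per} Y^{\sigma}_{[\mathcal J]}[\mathcal I|\mathcal J].$$

Equation (2.9) is obtained by expanding each term in the summation of
(2.8) in this way. □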

We note here that

$$D^n \operatorname{per} A(X, \ldots, X) = n! \operatorname{per} X \tag{2.10}$$

and

$$D^k \operatorname{per} A(X, \ldots, X) = 0 \quad \forall\, k > n. \tag{2.11}$$
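The diagonal case of (2.9), together with (2.10) and (2.11), can be compared with the Taylor coefficients of the polynomial $t \mapsto \operatorname{per}(A + tX)$, since $D^k \operatorname{per} A(X, \ldots, X)$ is $k!$ times the coefficient of $t^k$. The following sketch uses brute-force permanents and hypothetical helper names; indices are 0-based.

```python
from itertools import combinations, permutations
from math import factorial

def per(M):
    """Permanent by direct summation; the empty matrix has permanent 1."""
    n = len(M)
    total = 0
    for s in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= M[i][s[i]]
        total += prod
    return total

def taylor_coeffs(A, X):
    """Coefficients c_0, ..., c_n of per(A + tX) as a polynomial in t."""
    n = len(A)
    coeffs = [0] * (n + 1)
    for s in permutations(range(n)):
        term = [1]
        for i in range(n):
            a, x = A[i][s[i]], X[i][s[i]]
            nxt = [0] * (len(term) + 1)  # multiply term by (a + x*t)
            for d, c in enumerate(term):
                nxt[d] += c * a
                nxt[d + 1] += c * x
            term = nxt
        for d, c in enumerate(term):
            coeffs[d] += c
    return coeffs

def sub(M, I, J):
    """M[I|J]."""
    return [[M[i][j] for j in J] for i in I]

def comp(M, I, J):
    """M(I|J): delete rows I and columns J."""
    n = len(M)
    return [[M[i][j] for j in range(n) if j not in J]
            for i in range(n) if i not in I]

def dk_per(A, X, k):
    """k! * sum over I, J in Q_{k,n} of per A(I|J) * per X[I|J]."""
    Qkn = list(combinations(range(len(A)), k))  # empty when k > n
    return factorial(k) * sum(per(comp(A, I, J)) * per(sub(X, I, J))
                              for I in Qkn for J in Qkn)

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
X = [[1, 0, 2], [0, 1, 1], [3, 1, 0]]
n = len(A)
c = taylor_coeffs(A, X)
for k in range(1, n + 1):
    print(k, factorial(k) * c[k], dk_per(A, X, k))
```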

We now describe a generalisation of (2.3). Let $\mathcal H$ be an
$n$-dimensional Hilbert space. Let $\otimes^k \mathcal H$ be the $k$-fold
tensor power of $\mathcal H$ and $\vee^k \mathcal H$ be the symmetric tensor
power of $\mathcal H$. If $\{e_i\}$ is an orthonormal basis of $\mathcal H$,
then for $\mathcal I = (i_1, \ldots, i_k) \in G_{k,n}$, we define
$e_{\mathcal I} = e_{i_1} \vee \cdots \vee e_{i_k}$. If $\mathcal I$ consists
of $l$ distinct indices $i_1, \ldots, i_l$ with multiplicities
$m_1, \ldots, m_l$ respectively, put $m(\mathcal I) = m_1! \cdots m_l!$. Note
that if $\mathcal I \in Q_{k,n}$, then $m(\mathcal I) = 1$. An orthonormal
basis of $\vee^k \mathcal H$ is
$\{m(\mathcal I)^{-1/2} e_{\mathcal I} : \mathcal I \in G_{k,n}\}$. It is
conventional to order these multi-indices lexicographically. (See [4],
Chapter 2.)

Let $\vee^k A$ denote the $k$th symmetric tensor power of $A$. With respect to
the above mentioned basis, the $(\mathcal I, \mathcal J)$-entry of
$\vee^k A$ is
$(m(\mathcal I) m(\mathcal J))^{-1/2} \operatorname{per} A[\mathcal I|\mathcal J]$.
Let $P_k$ be the canonical projection of $\vee^k \mathcal H$ onto the subspace
generated by $\{e_{\mathcal I} : \mathcal I \in Q_{k,n}\}$. If we vary
$\mathcal I, \mathcal J$ in $Q_{k,n}$, we get the submatrix
$P_k(\vee^k A)P_k$ of $\vee^k A$.

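The matrix $\operatorname{padj}(A)^T$ can be identified with a submatrix of an
operator on the space $\vee^{n-1} \mathcal H$. We call this operator
$\tilde{\vee}^{n-1} A$. Then
$\operatorname{padj}(A)^T = P_{n-1}(\tilde{\vee}^{n-1} A)P_{n-1}$, which is an
$n \times n$ matrix. It is unitarily similar to the transpose of the submatrix
$P_{n-1}(\vee^{n-1} A)P_{n-1}$ of $\vee^{n-1} A$. Similarly, the transpose of
the matrix whose $(\mathcal I, \mathcal J)$-entry is
$\operatorname{per} A(\mathcal I|\mathcal J)$ can be identified with a
submatrix of an operator on the space $\vee^{n-k} \mathcal H$. We call this
operator $\tilde{\vee}^{n-k} A$. The $\binom{n}{k} \times \binom{n}{k}$
submatrix $P_{n-k}(\tilde{\vee}^{n-k} A)P_{n-k}$ of $\tilde{\vee}^{n-k} A$ is
unitarily similar to the transpose of the submatrix
$P_{n-k}(\vee^{n-k} A)P_{n-k}$ of $\vee^{n-k} A$.

Equation (2.3) can also be written as

$$D \operatorname{per} A(X) = \operatorname{tr}\big[(P_{n-1}(\tilde{\vee}^{n-1} A)P_{n-1}) X\big]. \tag{2.12}$$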

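Let $X^1, \ldots, X^k \in \mathcal{L}(\mathcal H)$. Consider the following
operator on $\otimes^k \mathcal H$:

$$\frac{1}{k!} \sum_{\sigma \in S_k} X^{\sigma(1)} \otimes X^{\sigma(2)} \otimes \cdots \otimes X^{\sigma(k)}. \tag{2.13}$$

It leaves the space $\vee^k \mathcal H$ invariant. We use the notation
$X^1 \vee X^2 \vee \cdots \vee X^k$ for the restriction of this operator to
the subspace $\vee^k \mathcal H$.

The generalisation of (2.12) is given as follows: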

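Theorem 2.4. For $1 \le k \le n$,

$$D^k \operatorname{per} A(X^1, \ldots, X^k) = k! \operatorname{tr}\big[(P_{n-k}(\tilde{\vee}^{n-k} A)P_{n-k})(P_k(X^1 \vee \cdots \vee X^k)P_k)\big]. \tag{2.14}$$

In particular,

$$D^k \operatorname{per} A(X, \ldots, X) = k! \operatorname{tr}\big[(P_{n-k}(\tilde{\vee}^{n-k} A)P_{n-k})(P_k(\vee^k X)P_k)\big].$$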

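Proof. Note that

$$Y^{\sigma}_{[\mathcal J]}[\mathcal I|\mathcal J] = \big(\langle e_{i_l}, X^{\sigma(m)} e_{j_m} \rangle\big)_{1 \le l, m \le k}$$

and the $(\mathcal I, \mathcal J)$-element of
$P_k(X^1 \vee \cdots \vee X^k)P_k$, where
$\mathcal I = (i_1, \ldots, i_k)$ and
$\mathcal J = (j_1, \ldots, j_k) \in Q_{k,n}$, is given by

$$\begin{aligned}
\langle e_{\mathcal I}, (P_k(X^1 \vee \cdots \vee X^k)P_k) e_{\mathcal J} \rangle
&= \langle e_{i_1} \vee \cdots \vee e_{i_k}, (X^1 \vee \cdots \vee X^k)(e_{j_1} \vee \cdots \vee e_{j_k}) \rangle \\
&= \frac{1}{k!} \sum_{\sigma \in S_k} \operatorname{per} Y^{\sigma}_{[\mathcal J]}[\mathcal I|\mathcal J].
\end{aligned}$$

Also, the $(\mathcal J, \mathcal I)$-element of
$P_{n-k}(\tilde{\vee}^{n-k} A)P_{n-k}$ is
$\operatorname{per} A(\mathcal I|\mathcal J)$, by the definition of
$\tilde{\vee}^{n-k} A$. Using all this in (2.9), we obtain (2.14). □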

Bhatia and Dias da Silva (Theorem 1, [?]) have obtained the norm of
$DK_{\chi}(A)$, where $K_{\chi}$ is the restriction of the map
$\otimes^m A$ to the symmetry class of tensors associated with $\chi$ and
$S_m$. A particular case is Theorem 1 in [1], which says that

$$\|D \vee^k (A)\| = k \|A\|^{k-1} \quad \forall\, 1 \le k \le n. \tag{2.15}$$

It follows from (2.15) that

$$\|D \operatorname{per} A\| \le n \|A\|^{n-1}. \tag{2.16}$$

We extend this to the following:



Theorem 2.5. For an $n \times n$ matrix $A$, we have

$$\|D^k \operatorname{per} A\| \le k! \binom{n}{k} \|A\|^{n-k}. \tag{2.17}$$

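Proof. To show this, we first note that if $A$ is an $n \times n$ matrix, then

$$\|A\| \le \|A\|_1 \le n \|A\|. \tag{2.18}$$

Also, if $s_1(A) \ge \cdots \ge s_n(A)$ are the singular values of $A$, then
the singular values of $\vee^k A$ are $s_{i_1} \cdots s_{i_k}$, where
$(i_1, \ldots, i_k)$ vary over $G_{k,n}$. So,

$$\|\vee^k A\| = s_1(A)^k. \tag{2.19}$$

Now, by definition,

$$\|D^k \operatorname{per} A\| = \sup_{\|X^1\| = \cdots = \|X^k\| = 1} \|D^k \operatorname{per} A(X^1, \ldots, X^k)\|. \tag{2.20}$$

Using (2.14) and the facts that if $\|X^j\| = 1$ for all $j$ then
$\|P_k(X^1 \vee \cdots \vee X^k)P_k\| \le 1$, and that the norm of a submatrix
of a matrix is less than or equal to the norm of the matrix, we get

$$\begin{aligned}
\|D^k \operatorname{per} A\|
&= k! \sup_{\|X^1\| = \cdots = \|X^k\| = 1} \big|\operatorname{tr}\big[(P_{n-k}(\tilde{\vee}^{n-k} A)P_{n-k})(P_k(X^1 \vee \cdots \vee X^k)P_k)\big]\big| \\
&\le k! \, \|P_{n-k}(\tilde{\vee}^{n-k} A)P_{n-k}\|_1 \\
&\le k! \binom{n}{k} \|P_{n-k}(\tilde{\vee}^{n-k} A)P_{n-k}\| \quad \text{(using (2.18))} \\
&\le k! \binom{n}{k} \|\tilde{\vee}^{n-k} A\| \\
&= k! \binom{n}{k} \|\vee^{n-k} A\| \\
&= k! \binom{n}{k} \|A\|^{n-k}.
\end{aligned}$$

□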



As a corollary, we obtain a perturbation bound for per. 



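Corollary 2.1. Let $A$ and $X$ be $n \times n$ matrices. Then

$$|\operatorname{per}(A + X) - \operatorname{per} A| \le \sum_{k=1}^{n} \binom{n}{k} \|A\|^{n-k} \|X\|^k. \tag{2.21}$$

Proof. This follows using Taylor's theorem:

$$\begin{aligned}
|\operatorname{per}(A + X) - \operatorname{per} A|
&= \Big| \sum_{k=1}^{n} \frac{1}{k!} D^k \operatorname{per} A(X, \ldots, X) \Big| \\
&\le \sum_{k=1}^{n} \frac{1}{k!} \|D^k \operatorname{per} A\| \, \|X\|^k \\
&\le \sum_{k=1}^{n} \binom{n}{k} \|A\|^{n-k} \|X\|^k.
\end{aligned}$$

□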

Consider the simplest commutative case: $A = I$, $X = xI$ with $x \ge 0$.
Then the expression on both sides of the inequality (2.21) is

$$\sum_{k=1}^{n} \binom{n}{k} x^k.$$

So no improvement of the corollary is possible in this sense.
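The bound (2.21) and its equality case can be illustrated with exact integer arithmetic on matrices whose operator norm is known in closed form. The sketch below assumes scaled permutation matrices (a choice made here, not in the paper): a permutation matrix is unitary, so `scale` times it has operator norm `|scale|`. All function names are illustrative.

```python
from itertools import permutations
from math import comb

def per(M):
    """Permanent by direct summation over permutations."""
    n = len(M)
    total = 0
    for s in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= M[i][s[i]]
        total += prod
    return total

def perm_matrix(s, scale):
    """scale * (permutation matrix of s); its operator norm is |scale|."""
    n = len(s)
    return [[scale if s[i] == j else 0 for j in range(n)] for i in range(n)]

def bound_221(n, norm_A, norm_X):
    """Right-hand side of (2.21)."""
    return sum(comb(n, k) * norm_A ** (n - k) * norm_X ** k
               for k in range(1, n + 1))

n, a, x = 3, 2, 1
rhs = bound_221(n, a, x)
gaps = []
for sA in permutations(range(n)):
    for sX in permutations(range(n)):
        A = perm_matrix(sA, a)
        X = perm_matrix(sX, x)
        S = [[A[i][j] + X[i][j] for j in range(n)] for i in range(n)]
        gaps.append(rhs - abs(per(S) - per(A)))

print(min(gaps), max(gaps))  # the bound holds when min(gaps) >= 0

# Equality case from the text: A = I, X = xI with x >= 0.
I3 = perm_matrix((0, 1, 2), 1)
xI = perm_matrix((0, 1, 2), 5)
S = [[I3[i][j] + xI[i][j] for j in range(3)] for i in range(3)]
equality_lhs = abs(per(S) - per(I3))
equality_rhs = bound_221(3, 1, 5)
```

Equality is also attained whenever $A$ and $X$ are positive multiples of the same permutation matrix, which is why the smallest gap in the loop is zero.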

3 Coefficients of the characteristic polynomial 

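The characteristic polynomial of $A$, by definition, is

$$\lambda^n - g_1 \lambda^{n-1} + g_2 \lambda^{n-2} - \cdots + (-1)^n g_n, \tag{3.1}$$

where the coefficient $g_r$ is the sum of the $r \times r$ principal minors of
$A$. In particular, $g_1$ is the trace of $A$ and $g_n$ is the determinant of
$A$. We consider $g_r : M(n) \to \mathbb{C}$ as the map taking a matrix to the
$r$th coefficient of its characteristic polynomial. Then

$$g_r(A) = \sum_{\mathcal I \in Q_{r,n}} \det A_{\mathcal I},$$

where $A_{\mathcal I}$ denotes the submatrix $A[\mathcal I|\mathcal I]$ of $A$.

For $\mathcal I = (i_1, i_2, \ldots, i_r) \in Q_{r,n}$, let $h_{\mathcal I}$
denote the map which takes an $n \times n$ matrix $A$ to $A_{\mathcal I}$. It
is a linear map. Then,

$$g_r(A) = \sum_{\mathcal I \in Q_{r,n}} (\det \circ\, h_{\mathcal I})(A). \tag{3.2}$$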
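The description of $g_r$ as a sum of principal minors can be cross-checked against the coefficients of $\det(\lambda I - A)$ computed independently. In the sketch below the independent computation uses the Faddeev-LeVerrier recursion, which is not part of the paper and is brought in only for the comparison; function names are likewise illustrative.

```python
from fractions import Fraction
from itertools import combinations, permutations

def det(M):
    """Determinant by the Leibniz formula (adequate for small matrices)."""
    n = len(M)
    total = 0
    for s in permutations(range(n)):
        inv = sum(1 for a in range(n) for b in range(a + 1, n) if s[a] > s[b])
        term = -1 if inv % 2 else 1
        for i in range(n):
            term *= M[i][s[i]]
        total += term
    return total

def g(A, r):
    """g_r(A): sum of det A[I|I] over I in Q_{r,n}."""
    n = len(A)
    return sum(det([[A[i][j] for j in I] for i in I])
               for I in combinations(range(n), r))

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][l] * Q[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def charpoly_coeffs(A):
    """c_0..c_n with det(lambda*I - A) = sum c_i lambda^i (Faddeev-LeVerrier)."""
    n = len(A)
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        # M_k = A M_{k-1} + c_{n-k+1} I
        M = [[AM[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AMk = matmul(A, M)
        c[n - k] = -sum(AMk[i][i] for i in range(n)) / k
    return c

A = [[2, -1, 0, 3], [1, 4, 2, 0], [0, 5, -2, 1], [3, 0, 1, 1]]
n = len(A)
c = charpoly_coeffs(A)
for r in range(1, n + 1):
    # By (3.1), the coefficient of lambda^(n-r) is (-1)^r g_r.
    print(r, g(A, r), (-1) ** r * c[n - r])
```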

We derive three different expressions for higher order derivatives of the 
coefficients of the characteristic polynomial of A, which follow as corollaries 
to the theorems in [7]. We first give a lemma from which all three of them
follow immediately. 



Lemma. If $f$ and $g$ are two maps such that $f \circ g$ is well defined and
$g$ is linear, then

$$D^k (f \circ g)(A)(X^1, \ldots, X^k) = D^k f(g(A))(g(X^1), \ldots, g(X^k)). \tag{3.3}$$

Proof. This follows by (1.2). □

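From Theorems 1 and 2 of [7], we obtain

Theorem 3.1. For $1 \le k \le r \le n$,

$$D^k g_r(A)(X^1, \ldots, X^k) = \sum_{\mathcal I \in Q_{r,n}} \sum_{\sigma \in S_k} \sum_{\mathcal J \in Q_{k,r}} \det A_{\mathcal I}(\mathcal J; X^{\sigma(1)}_{\mathcal I}, \ldots, X^{\sigma(k)}_{\mathcal I}). \tag{3.4}$$

Theorem 3.2. For $1 \le k \le r \le n$,

$$D^k g_r(A)(X^1, \ldots, X^k) = \sum_{\mathcal I \in Q_{r,n}} \sum_{\sigma \in S_k} \sum_{\mathcal K, \mathcal J \in Q_{k,r}} (-1)^{|\mathcal K| + |\mathcal J|} \det (A_{\mathcal I})(\mathcal K|\mathcal J) \, \det \big(Y^{\sigma}_{[\mathcal J]}\big)_{\mathcal I}[\mathcal K|\mathcal J]. \tag{3.5}$$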
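The column-replacement form of the derivatives of $g_r$, obtained by combining the lemma with the determinant analogue of (2.8), can be checked in the diagonal case $X^1 = \cdots = X^k = X$, where $D^k g_r(A)(X, \ldots, X)$ is $k!$ times the coefficient of $t^k$ in $g_r(A + tX)$. This is a sketch with brute-force determinants and hypothetical helper names; indices are 0-based.

```python
from itertools import combinations, permutations
from math import factorial

def sign(s):
    """Sign of a permutation given as a tuple of images."""
    n = len(s)
    inv = sum(1 for a in range(n) for b in range(a + 1, n) if s[a] > s[b])
    return -1 if inv % 2 else 1

def det(M):
    """Determinant by the Leibniz formula."""
    n = len(M)
    total = 0
    for s in permutations(range(n)):
        term = sign(s)
        for i in range(n):
            term *= M[i][s[i]]
        total += term
    return total

def det_taylor(A, X):
    """Coefficients of det(A + tX) in t, lowest degree first."""
    n = len(A)
    coeffs = [0] * (n + 1)
    for s in permutations(range(n)):
        term = [sign(s)]
        for i in range(n):
            a, x = A[i][s[i]], X[i][s[i]]
            nxt = [0] * (len(term) + 1)  # multiply term by (a + x*t)
            for d, c in enumerate(term):
                nxt[d] += c * a
                nxt[d + 1] += c * x
            term = nxt
        for d, c in enumerate(term):
            coeffs[d] += c
    return coeffs

def principal(M, I):
    """M_I = M[I|I]."""
    return [[M[i][j] for j in I] for i in I]

def replace_columns(A, J, X):
    """A(J; X, ..., X): the columns listed in J are taken from X."""
    n = len(A)
    return [[X[i][j] if j in J else A[i][j] for j in range(n)] for i in range(n)]

def dk_g_columns(A, X, r, k):
    """k! * sum over I in Q_{r,n}, J in Q_{k,r} of det A_I(J; X_I, ..., X_I)."""
    total = 0
    for I in combinations(range(len(A)), r):
        AI, XI = principal(A, I), principal(X, I)
        total += sum(det(replace_columns(AI, J, XI))
                     for J in combinations(range(r), k))
    return factorial(k) * total

def dk_g_taylor(A, X, r, k):
    """k! * (coefficient of t^k in g_r(A + tX)), for k <= r."""
    return factorial(k) * sum(det_taylor(principal(A, I), principal(X, I))[k]
                              for I in combinations(range(len(A)), r))

A = [[2, -1, 0, 3], [1, 4, 2, 0], [0, 5, -2, 1], [3, 0, 1, 1]]
X = [[1, 0, 2, 1], [0, 1, 1, 0], [3, 1, 0, 2], [1, 2, 0, 1]]
for r in range(1, 5):
    for k in range(1, r + 1):
        print(r, k, dk_g_columns(A, X, r, k), dk_g_taylor(A, X, r, k))
```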

We now give another expression for the derivatives of $g_r$ using Theorem
3 of [7]. Let $\wedge^k \mathcal H$ denote the antisymmetric tensor power of
$\mathcal H$. If $\{e_i\}$ is an orthonormal basis of $\mathcal H$, then for
$\mathcal I = (i_1, \ldots, i_k) \in Q_{k,n}$, we define
$e_{\mathcal I} = e_{i_1} \wedge \cdots \wedge e_{i_k}$. Then
$\{e_{\mathcal I} : \mathcal I \in Q_{k,n}\}$ is an orthonormal basis of
$\wedge^k \mathcal H$. It is conventional to order these multi-indices
lexicographically. (See [4], Chapter 2.) $\wedge^k A$ is the $k$th
antisymmetric tensor power of $A$. With respect to the above mentioned basis,
the $(\mathcal I, \mathcal J)$-entry of $\wedge^k A$ is
$\det A[\mathcal I|\mathcal J]$. The transpose of the matrix with entries
$(-1)^{|\mathcal I| + |\mathcal J|} \det A(\mathcal I|\mathcal J)$ can be
identified with an operator on the space $\wedge^{n-k} \mathcal H$. We call
this operator $\tilde{\wedge}^{n-k} A$ and note that it is unitarily similar
to the transpose of $\wedge^{n-k} A$.

For $X^1, \ldots, X^k \in \mathcal{L}(\mathcal H)$, consider the operator given
by (2.13). This leaves the space $\wedge^k \mathcal H$ invariant. We use the
notation $X^1 \wedge X^2 \wedge \cdots \wedge X^k$ for the restriction of the
operator (2.13) to the subspace $\wedge^k \mathcal H$.

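Theorem 3.3. For $1 \le k \le r \le n$,

$$D^k g_r(A)(X^1, \ldots, X^k) = k! \sum_{\mathcal I \in Q_{r,n}} \operatorname{tr}\big[(\tilde{\wedge}^{r-k} A_{\mathcal I})(X^1_{\mathcal I} \wedge \cdots \wedge X^k_{\mathcal I})\big]. \tag{3.6}$$

In particular,

$$D^k g_r(A)(X, \ldots, X) = k! \sum_{\mathcal I \in Q_{r,n}} \operatorname{tr}\big[(\tilde{\wedge}^{r-k} A_{\mathcal I})(\wedge^k X_{\mathcal I})\big].$$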



Let $s_1(A) \ge \cdots \ge s_n(A)$ be the singular values of $A$, and let
$\|A\| := s_1(A)$ be the operator norm of $A$. Also,

$$\|D g_r(A)\| = \sup_{\|X\| = 1} \|D g_r(A)(X)\|. \tag{3.7}$$

The trace norm of $A$ is defined as

$$\|A\|_1 = s_1(A) + \cdots + s_n(A). \tag{3.8}$$

This is the dual of the operator norm ([?], Chapter 4). So,

$$\|A\|_1 = \sup_{\|X\| = 1} |\operatorname{tr} AX|.$$

For $1 \le k \le n$, let $p_k(x_1, \ldots, x_n)$ denote the $k$th elementary
symmetric polynomial in $n$ variables. We now give the exact norm of
$D^k g_r(A)$, using Theorem 4 of [7].

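Theorem 3.4. Let $A$ be an $n \times n$ matrix and let
$s_1(A_{\mathcal I}), \ldots, s_r(A_{\mathcal I})$ be the singular values of
$A_{\mathcal I}$. Then,

$$\|D^k g_r(A)\| = k! \sum_{\mathcal I \in Q_{r,n}} p_{r-k}(s_1(A_{\mathcal I}), \ldots, s_r(A_{\mathcal I})). \tag{3.9}$$

Proof. For every $\mathcal I \in Q_{r,n}$, let
$A_{\mathcal I} = U S_{\mathcal I} V$ be the singular value decomposition of
$A_{\mathcal I}$. Using (3.3) and (3.2), we get

$$D^k g_r(A)(X^1, \ldots, X^k) = \sum_{\mathcal I \in Q_{r,n}} D^k \det S_{\mathcal I}(U^* X^1_{\mathcal I} V^*, \ldots, U^* X^k_{\mathcal I} V^*).$$

It follows from here that

$$\|D^k g_r(A)\| = \sum_{\mathcal I \in Q_{r,n}} \|D^k \det S_{\mathcal I}\|.$$

Theorem 4 of [7] now gives the desired result. □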

As a corollary, we have the following perturbation bound for $g_r$.

Corollary 3.1. Let $A$ and $X$ be $n \times n$ matrices. Then,

$$|g_r(A + X) - g_r(A)| \le \sum_{\mathcal I \in Q_{r,n}} \sum_{k=1}^{r} p_{r-k}(s_1(A_{\mathcal I}), \ldots, s_r(A_{\mathcal I})) \|X\|^k. \tag{3.10}$$

Proof. This is again a consequence of Taylor's theorem:

$$g_r(A + X) = g_r(A) + \sum_{k=1}^{r} \frac{1}{k!} D^k g_r(A)(X, \ldots, X) + O(\|X\|^{r+1}). \tag{3.11}$$

It follows from here that

$$|g_r(A + X) - g_r(A)| \le \sum_{k=1}^{r} \frac{1}{k!} \|D^k g_r(A)\| \, \|X\|^k.$$

Using Theorem 3.4 in this, we get (3.10). □



In the simplest commutative case, where $A = I$ and $X = xI$ with $x \ge 0$,
both sides of (3.10) equal

$$\binom{n}{r} \sum_{k=1}^{r} \binom{r}{k} x^k.$$



A weaker perturbation bound can be obtained as follows.

Corollary 3.2. For $n \times n$ matrices $A$ and $X$,

$$|g_r(A + X) - g_r(A)| \le \binom{n}{r} \sum_{k=1}^{r} \binom{r}{k} \|A\|^{r-k} \|X\|^k. \tag{3.12}$$

Proof. This follows from Corollary 3.1, using the facts that
$\|A\| = s_1(A)$ and that the norm of a submatrix of a matrix is less than or
equal to the norm of the matrix. □
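The bound (3.12) and the equality case $A = I$, $X = xI$ can be illustrated on diagonal matrices, for which everything is computable exactly: $g_r$ of a diagonal matrix is the $r$th elementary symmetric polynomial of its diagonal entries, and the operator norm is the largest absolute value on the diagonal. Restricting to diagonal matrices is an assumption made for this sketch, and the function names are illustrative.

```python
from itertools import combinations
from math import comb, prod

def g_diag(d, r):
    """g_r of diag(d): the r-th elementary symmetric polynomial of d."""
    return sum(prod(d[i] for i in I) for I in combinations(range(len(d)), r))

def bound_312(n, r, norm_A, norm_X):
    """Right-hand side of (3.12)."""
    return comb(n, r) * sum(comb(r, k) * norm_A ** (r - k) * norm_X ** k
                            for k in range(1, r + 1))

# Equality case: A = I, X = xI with x >= 0.
n, x = 4, 3
equal = [abs(g_diag([1 + x] * n, r) - g_diag([1] * n, r)) == bound_312(n, r, 1, x)
         for r in range(1, n + 1)]
print(equal)

# A generic diagonal pair: the inequality is strict but still holds.
a = [2, -1, 3, 0]
d = [1, -2, 0, 1]
norm_A = max(abs(v) for v in a)
norm_X = max(abs(v) for v in d)
shifted = [u + v for u, v in zip(a, d)]
holds = [abs(g_diag(shifted, r) - g_diag(a, r)) <= bound_312(n, r, norm_A, norm_X)
         for r in range(1, n + 1)]
print(holds)
```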